Release 10.1A: OpenEdge Getting Started:
Database Essentials
Indexing basics
This section explains the basics of indexing. Topics include:
How indexes work
A database index works like a book index. To look up a topic, you scan the book index, locate the topic, and turn to the pages where the information resides. The index itself does not contain the information; it only contains page numbers that direct you to the pages where the information resides. Without an index, you would have to search the entire book, scanning each page sequentially.
Similarly, if you ask for specific data from a database, the database engine uses an index to find the data. An index contains two pieces of information—the index key and a row pointer that points to the corresponding row in the main table. Figure 2–6 illustrates this using the Order table from Sports 2000 database.
Figure 2–6: Indexing the Order table
![]()
The index table entries are always sorted in numerical, alphabetical, or chronological order. Using the pointers, the system can then access data rows directly and in the sort order specified by the index.
Every table should have at least one index, the primary index. When you create the first index on any table, OpenEdge assumes it is the primary index and sets the Primary flag accordingly. In Figure 2–6, the Order-Num index is the primary index.
Reasons for defining an index
There are four significant benefits to defining an index for a table:
- Direct access and rapid retrieval of rows.
The rows of your tables are physically stored in the sequence the users enter them into the database. If you want to find a particular row, the database engine must scan every individual row in the entire table until it locates one or more rows that meet your selection criteria. Scanning is inefficient and time consuming, particularly as the size of your table increases. When you create an index, the index entries are stored in an ordered manner to allow for fast lookup.
For example, when you query for order number 4, OpenEdge does not go to the main table. Instead, it goes directly to the Order-Num index to search for this value. OpenEdge uses the pointer to read the corresponding row in the Order table. Because the index is stored in numerical order, the search and retrieval of rows is very fast.
Similarly, having an index on the date column allows the system to go directly to the date value that you query (for example, 9/13/04). The system then uses the pointer to read the row with that date in the Order table. Again, because the date index is stored in chronological order, the search and retrieval of rows is very fast.
- Automatic ordering of rows.
An index imposes an order on rows. Since an index automatically sorts rows sequentially (instead of the order in which the rows are created and stored on the disk), you can get very fast responses for range queries. For example, when you query, “Find all orders with dates from 09/6/04 to 09/20/04,” all the order rows for that range appear in chronological order.
Note: Although an index imposes order on rows, the data stored on disk is in the order in which it was created. So, you can have multiple indexes on a table, each providing a different sort ordering, and the physical storage order is not controlled by the indexes.- Enforced uniqueness.
When you define a unique index for a table, the system ensures that no two rows can have the same value for that index. For example, if order-num 4 already exists and you attempt to create an order with order-num 4, you get an error message informing you that 4 already exists. The message appears because order-num is a unique index for the order table.
- Rapid processing of inter-table relationships.
Two tables are related if you define a column (or columns) in one table that you use to access a row in another table. If the table you access has an index based on the corresponding column, then the row access is much more efficient. The column you use to relate two tables need not have the same name in both tables.
Sample indexes
Table 2–9 lists some indexes defined in the Sports 2000 database, showing why the index was defined.
Disadvantages of defining an index
Even though indexes are beneficial, there are two things to keep in mind when defining indexes for your database:
- Indexes take up disk space. (See the "Calculating index size" section.)
- Indexes can slow down other processes. When the user updates an indexed column, OpenEdge updates all related indexes as well. Also, when the user creates or deletes a row, OpenEdge changes all the indexes for that table.
Define the indexes that your application requires, but avoid indexes that provide minor benefit or are infrequently used. For example, unless you display data in a particular order frequently (such as by zip code), then you are better off sorting the data when you display it instead of defining an index to do automatic sorting.
|
Copyright © 2005 Progress Software Corporation www.progress.com Voice: (781) 280-4000 Fax: (781) 280-4095 |